## # A tibble: 9 × 3
## # Groups:   type [3]
##   type       month mean_value
##   <chr>      <dbl>      <dbl>
## 1 both           1       14.5
## 2 both           2       13.4
## 3 both          12       13.1
## 4 commercial     1       13.1
## 5 commercial     7       13.1
## 6 commercial    12       12.7
## 7 oscar          1       12.0
## 8 oscar          2       11.6
## 9 oscar          3       11.0

## # A tibble: 25 × 3
## # Groups:   cluster [5]
##    name             cluster distance
##    <chr>              <int>    <dbl>
##  1 Chris Hemsworth        1   0.222 
##  2 Nicolas Cage           1   0.232 
##  3 Lyna                   1   0.234 
##  4 Sam Elliott            1   0.240 
##  5 Mads Mikkelsen         1   0.248 
##  6 Maria Bakalova         2   0.0219
##  7 Chadwick Boseman       2   0.0225
##  8 Irrfan Khan            2   0.0248
##  9 Jean Dujardin          2   0.0275
## 10 Carrie Fisher          2   0.0293
## # ℹ 15 more rows
## # A tibble: 5 × 3
## # Groups:   cluster, nominee [5]
##   cluster nominee prop_nominees
##     <int>   <dbl>         <dbl>
## 1       1       1         0.246
## 2       2       1         0.454
## 3       3       1         0.408
## 4       4       1         0.222
## 5       5       1         0.298
## Error in ar.burg.default(x, aic = aic, order.max = order.max, na.action = na.action,  : 
##   zero-variance series
## Error in ar.burg.default(x, aic = aic, order.max = order.max, na.action = na.action,  : 
##   zero-variance series
## Error in ar.burg.default(x, aic = aic, order.max = order.max, na.action = na.action,  : 
##   zero-variance series
## Error in if (order) coefs[order, 1L:order] else numeric() : 
##   argument is not interpretable as logical
## Error in ar.burg.default(x, aic = aic, order.max = order.max, na.action = na.action,  : 
##   zero-variance series
## Error in ar.burg.default(x, aic = aic, order.max = order.max, na.action = na.action,  : 
##   zero-variance series
## Error in ar.burg.default(x, aic = aic, order.max = order.max, na.action = na.action,  : 
##   zero-variance series
## Error in ar.burg.default(x, aic = aic, order.max = order.max, na.action = na.action,  : 
##   zero-variance series
## 
##   0   1 
## 317 317
## 
##   0   1 
## 317 317
## 
## Call:
##  randomForest(formula = nominee ~ ., data = train_data) 
##                Type of random forest: classification
##                      Number of trees: 500
## No. of variables tried at each split: 4
## 
##         OOB estimate of  error rate: 28.06%
## Confusion matrix:
##     0   1 class.error
## 0 180  73   0.2885375
## 1  69 184   0.2727273
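
The out-of-bag figures above can be checked by hand from the printed confusion matrix. This is a worked sketch, not part of the original output. It reads the rows as true classes, which is the orientation that reproduces the `class.error` column.

$$
\text{OOB error} = \frac{73 + 69}{180 + 73 + 69 + 184} = \frac{142}{506} \approx 0.2806
$$

$$
\text{class.error}_0 = \frac{73}{180 + 73} = \frac{73}{253} \approx 0.2885,
\qquad
\text{class.error}_1 = \frac{69}{69 + 184} = \frac{69}{253} \approx 0.2727
$$
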
##          Actual
## Predicted  0  1
##         0 46 19
##         1 18 45
## [1] "Accuracy:  0.7109375"
## Confusion Matrix and Statistics
## 
##           Reference
## Prediction  0  1
##          0 46 19
##          1 18 45
##                                           
##                Accuracy : 0.7109          
##                  95% CI : (0.6242, 0.7876)
##     No Information Rate : 0.5             
##     P-Value [Acc > NIR] : 1.003e-06       
##                                           
##                   Kappa : 0.4219          
##                                           
##  Mcnemar's Test P-Value : 1               
##                                           
##             Sensitivity : 0.7188          
##             Specificity : 0.7031          
##          Pos Pred Value : 0.7077          
##          Neg Pred Value : 0.7143          
##              Prevalence : 0.5000          
##          Detection Rate : 0.3594          
##    Detection Prevalence : 0.5078          
##       Balanced Accuracy : 0.7109          
##                                           
##        'Positive' Class : 0               
## 
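
The headline statistics in this report follow directly from the 2 × 2 table, with class `0` as the positive class. The arithmetic below is a worked check rather than original output. The labels $TP = 46$, $FP = 19$, $FN = 18$, $TN = 45$ are introduced here for illustration, reading rows as predictions and columns as the reference.

$$
\text{Accuracy} = \frac{TP + TN}{TP + FP + FN + TN} = \frac{46 + 45}{128} = \frac{91}{128} \approx 0.7109
$$

$$
\text{Sensitivity} = \frac{TP}{TP + FN} = \frac{46}{64} \approx 0.7188,
\qquad
\text{Specificity} = \frac{TN}{TN + FP} = \frac{45}{64} \approx 0.7031
$$

$$
\text{PPV} = \frac{TP}{TP + FP} = \frac{46}{65} \approx 0.7077,
\qquad
\text{NPV} = \frac{TN}{TN + FN} = \frac{45}{63} \approx 0.7143
$$

$$
p_e = \frac{65 \cdot 64 + 63 \cdot 64}{128^2} = 0.5,
\qquad
\kappa = \frac{\text{Accuracy} - p_e}{1 - p_e} = \frac{0.7109 - 0.5}{0.5} \approx 0.4219
$$

Because the test set is balanced (64 cases per class), balanced accuracy, the mean of sensitivity and specificity, equals plain accuracy here.
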
##                      MeanDecreaseGini
## trend                       15.317558
## spike                       13.083589
## linearity                   55.017603
## curvature                   23.670419
## e_acf1                      13.348541
## e_acf10                     11.671731
## entropy                     10.823001
## x_acf1                      12.192964
## x_acf10                     12.718348
## diff1_acf1                  16.797037
## diff1_acf10                 12.562506
## diff2_acf1                  13.624891
## max_spike_height            11.072804
## nominated_previously         2.918732
## age                         12.382291
## gender                       1.575557
## american                     1.636712
## cluster                     10.202593
## won_previously               1.688494

## 
## Call:
##  randomForest(formula = nominee ~ ., data = train_data) 
##                Type of random forest: classification
##                      Number of trees: 500
## No. of variables tried at each split: 3
## 
##         OOB estimate of  error rate: 28.46%
## Confusion matrix:
##     0   1 class.error
## 0 182  71   0.2806324
## 1  73 180   0.2885375
##          Actual
## Predicted  0  1
##         0 46 17
##         1 18 47
## [1] "Accuracy:  0.7265625"
## Confusion Matrix and Statistics
## 
##           Reference
## Prediction  0  1
##          0 46 17
##          1 18 47
##                                           
##                Accuracy : 0.7266          
##                  95% CI : (0.6408, 0.8016)
##     No Information Rate : 0.5             
##     P-Value [Acc > NIR] : 1.492e-07       
##                                           
##                   Kappa : 0.4531          
##                                           
##  Mcnemar's Test P-Value : 1               
##                                           
##             Sensitivity : 0.7188          
##             Specificity : 0.7344          
##          Pos Pred Value : 0.7302          
##          Neg Pred Value : 0.7231          
##              Prevalence : 0.5000          
##          Detection Rate : 0.3594          
##    Detection Prevalence : 0.4922          
##       Balanced Accuracy : 0.7266          
##                                           
##        'Positive' Class : 0               
## 
##                  MeanDecreaseGini
## trend                   17.644047
## spike                   14.426482
## linearity               55.662106
## curvature               25.216108
## e_acf1                  14.118445
## e_acf10                 13.641810
## entropy                 15.322161
## x_acf1                  15.472704
## x_acf10                 13.446183
## diff1_acf1              15.663549
## diff1_acf10             14.007693
## diff2_acf1              14.177337
## max_spike_height        13.110981
## cluster                  9.766231
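
The two fits report different values for "No. of variables tried at each split" (4, then 3). Neither `Call` line sets `mtry`, so this is presumably the `randomForest` default for classification, the floor of the square root of the number of predictors $p$. Taking $p$ from the number of rows in each importance table (19, then 14) gives matching values:

$$
\lfloor \sqrt{19} \rfloor = \lfloor 4.36 \rfloor = 4,
\qquad
\lfloor \sqrt{14} \rfloor = \lfloor 3.74 \rfloor = 3
$$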